GPT-4 Overview

This section explains the latest techniques for using GPT-4 effectively, including helpful tips, applications, and limitations. You’ll also find suggestions for further reading.

What is GPT-4?

GPT-4 is the newest AI model from OpenAI. It’s a large model that can understand both text and images, but its main output is text. GPT-4 performs very well on various professional and academic tests, often at a human level.

Test Performance:
- GPT-4 ranks in the top 10% on a simulated bar exam.
- It also does well on tough benchmarks like MMLU and HellaSwag.

OpenAI improved GPT-4 based on feedback from testing and by learning from their previous model, ChatGPT. This has led to improvements in accuracy, adaptability, and overall performance.

What is GPT-4 Turbo?

GPT-4 Turbo is an enhanced version of GPT-4. It’s faster, follows instructions better, and can handle things like generating JSON, working with large amounts of data (up to 128,000 tokens), and calling multiple functions at the same time. It’s available for developers via API.

Vision Capabilities

Right now, GPT-4’s API only supports text inputs, but there are plans to allow image inputs soon. Compared to GPT-3.5, GPT-4 is more reliable and creative. It can handle complex tasks in a variety of languages. For example, if you give it a visual chart along with a text prompt, GPT-4 can process both and provide detailed answers.

Example of GPT-4 Handling Image and Text Together:

You can ask GPT-4 to analyze an image, like a chart showing meat consumption, and answer questions based on it. If asked, “What is the total average daily meat consumption for Georgia and Western Asia?”, GPT-4 would walk through the steps:

1. Find Georgia’s daily meat consumption.
2. Find Western Asia’s daily meat consumption.
3. Add the two values together.

By following these steps, GPT-4 gives the correct result.

Steering GPT-4

You can control GPT-4’s responses by setting guidelines through system messages. For example, if you need data in a specific format, like JSON, you can tell GPT-4 to always return its responses in that format. It will then follow those instructions for all future responses, giving consistent results.

Example:
System Message: "Always write your response in JSON."
User: "Give me 10 sample texts with their sentiment labels."

GPT-4 would respond with a JSON output that includes text samples and whether they have positive or negative sentiment.

Text Generation with GPT-4

GPT-4 is great for tasks like:
- Writing documents or code
- Answering questions about specific topics
- Translating languages
- Teaching or tutoring on different subjects
- Creating chatbots or characters for games

Chat Completions

GPT-4’s API can be used for conversations, where the model remembers the context of the chat and can respond appropriately. The system messages guide the model's responses, helping it follow specific instructions during the conversation.

JSON Mode

When working with APIs, you can instruct GPT-4 to always return valid JSON responses by setting it to JSON mode. This prevents errors and ensures that the model sticks to a specific format.

Reproducible Outputs

GPT-4’s responses are usually not deterministic, meaning they can change slightly with each request. However, by using a "seed" value, you can make the responses more predictable. OpenAI also provides a "system fingerprint" that tracks changes in the model that might affect output consistency.

Function Calling

GPT-4 can generate structured responses, such as JSON outputs, to help call functions in your code. For example, you could ask the model to get the weather in three locations at once, and it will provide the necessary data in a format you can use directly in your application.

Common Use Cases for Function Calling:
- Creating assistants that call APIs (e.g., weather services)
- Turning natural language into API calls
- Extracting data from text, like dates or names

Limitations of GPT-4

GPT-4 isn’t perfect and sometimes makes mistakes or "hallucinates" answers. It's best not to use it for critical tasks where accuracy is vital. For example, on the TruthfulQA test, GPT-4 outperforms GPT-3.5, but it still struggles with certain types of reasoning.

Example of Failure:
When asked who invented rock and roll, GPT-4 might incorrectly say Elvis Presley instead of the correct answer, Chuck Berry.

Improving GPT-4’s Results

To get better answers, you can give GPT-4 more specific instructions, like asking it to think step-by-step before answering. For high-stakes tasks, consider using external knowledge sources to improve its accuracy.